An expectation maximization algorithm for training hidden substitution models.

Authors

  • I Holmes
  • G M Rubin
Abstract

We derive an expectation maximization algorithm for maximum-likelihood training of substitution rate matrices from multiple sequence alignments. The algorithm can be used to train hidden substitution models, where the structural context of a residue is treated as a hidden variable that can evolve over time. We used the algorithm to train hidden substitution matrices on protein alignments in the Pfam database. Measuring the accuracy of multiple alignment algorithms with reference to BAliBASE (a database of structural reference alignments), we find that our substitution matrices consistently outperform the PAM series, with the improvement steadily increasing as up to four hidden site classes are added. We discuss several applications of this algorithm in bioinformatics.
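The hidden-site-class idea can be illustrated with a toy sketch. This is not the paper's algorithm (which trains continuous-time rate matrices over phylogenies by EM): here, as a simplifying assumption, each hypothetical hidden class c has a joint probability table M[c][x][y] over aligned residue pairs, and EM fits the class weights and tables from observed pairs.

```python
# Toy sketch, assuming a mixture-of-substitution-tables model: each hidden
# site class c has a joint probability table M[c][x][y] over aligned residue
# pairs; EM alternates posterior class responsibilities (E-step) with
# re-estimation of weights and tables from expected counts (M-step).
import math
import random

ALPHABET = "ACGT"   # toy alphabet; the paper works with amino acids
A = len(ALPHABET)

def log_likelihood(pairs, w, M):
    """Total log-likelihood of aligned index pairs under the mixture."""
    return sum(math.log(sum(w[c] * M[c][i][j] for c in range(len(w))))
               for i, j in pairs)

def em_mixture(pairs, n_classes=2, n_iter=50, seed=0):
    rng = random.Random(seed)
    w = [1.0 / n_classes] * n_classes
    # Random initial substitution tables, each normalised to sum to 1.
    M = []
    for _ in range(n_classes):
        t = [[rng.random() for _ in range(A)] for _ in range(A)]
        z = sum(map(sum, t))
        M.append([[v / z for v in row] for row in t])
    for _ in range(n_iter):
        counts = [[[1e-9] * A for _ in range(A)] for _ in range(n_classes)]
        wsum = [0.0] * n_classes
        for i, j in pairs:
            p = [w[c] * M[c][i][j] for c in range(n_classes)]
            tot = sum(p)
            # E-step: posterior responsibility of each hidden class
            for c in range(n_classes):
                r = p[c] / tot
                counts[c][i][j] += r
                wsum[c] += r
        # M-step: re-estimate weights and tables from expected counts
        w = [wsum[c] / len(pairs) for c in range(n_classes)]
        for c in range(n_classes):
            z = sum(map(sum, counts[c]))
            M[c] = [[v / z for v in row] for row in counts[c]]
    return w, M
```

As with any EM procedure, each iteration does not decrease the data log-likelihood, which gives a simple sanity check when experimenting with the sketch.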


Similar articles

Training Asynchronous Input/Output Hidden Markov Models

In learning tasks in which input sequences are mapped to output sequences, it is often the case that the input and output sequences are not synchronous. For example, in speech recognition, acoustic sequences are longer than phoneme sequences. Input/Output Hidden Markov Models have already been proposed to represent the distribution of an output sequence given an input sequence of the same leng...


Discriminative training using the trusted expectation maximization

We present the Trusted Expectation-Maximization (TEM), a new discriminative training scheme, for speech recognition applications. In particular, the TEM algorithm may be used for Hidden Markov Model (HMM) based discriminative training. The TEM algorithm has a form similar to the Expectation-Maximization (EM) algorithm, which is an efficient iterative procedure to perform maximum likelihood in ...


VQ-EM codebook training and testing

This paper presents a scheme of speaker-independent isolated word recognition in which Hidden Markov Modelling is used with Vector Quantization codebooks constructed using the Expectation-Maximization (EM) algorithm for Gaussian mixture models. In comparison with conventional vector quantization, the EM algorithm results in greater recognition accuracy.


Noise Benefits in Expectation-Maximization Algorithms

This dissertation shows that careful injection of noise into sample data can substantially speed up Expectation-Maximization algorithms. Expectation-Maximization algorithms are a class of iterative algorithms for extracting maximum likelihood estimates from corrupted or incomplete data. The convergence speed-up is an example of a noise benefit or "stochastic resonance" in statistical signal proce...


Particle Swarm Optimization for Hidden Markov Models with application to Intracranial Pressure analysis

The paper presents a new application of Particle Swarm Optimization for training Hidden Markov Models. The approach is verified on artificial data, and the application to Intracranial Pressure (ICP) analysis is further described. In comparison with the Expectation Maximization algorithm, commonly used for the HMM training problem, the PSO approach is less sensitive to becoming stuck in local optima because...




Journal:
  • Journal of molecular biology

Volume 317, Issue 5

Pages: -

Published: 2002